On feature selection in maximum entropy approach to statistical concept-based speech-to-speech translation
Abstract
Feature selection is critical to the performance of maximum-entropy-based statistical concept-based spoken language translation. The source-language spoken message is first parsed into a structured conceptual tree, and the target-language output is then generated from that tree using maximum entropy modeling. To improve feature selection in this maximum entropy approach, a new concept-word feature is proposed, which exploits both concept-level and word-level information. It thus enables the design of concise yet informative concept sets and eases both annotation and parsing efforts. The concept generation error rate is reduced by over 90% on the training set and 7% on the test set in our limited-domain speech translation corpus. To alleviate the data sparseness problem, multiple feature sets are proposed and employed, which achieve a further 10%-14% error rate reduction. Improvements are also achieved in our experiments on speech-to-speech translation.
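As a rough illustration of the idea in the abstract, the sketch below scores candidate target concepts with a conditional maximum entropy model, p(y | x) = exp(sum_i w_i f_i(x, y)) / Z(x), using three indicator feature templates: concept-only, word-only, and a joint concept-word feature. The feature templates, the label names (`TIME_TGT`, `PLACE_TGT`), and the weights are assumptions made up for this example; the abstract does not specify the paper's actual feature definitions or training procedure.

```python
import math

def features(context, outcome):
    """Indicator features pairing parts of the context with a candidate outcome.

    `context` is a hypothetical (concept, word) pair. The three templates are
    illustrative only and are not taken from the paper.
    """
    concept, word = context
    return [
        ("concept", concept, outcome),             # concept-level information only
        ("word", word, outcome),                   # word-level information only
        ("concept-word", concept, word, outcome),  # joint concept-word feature
    ]

def maxent_probs(context, outcomes, weights):
    """Return p(y | x) for each candidate outcome under a log-linear model."""
    scores = {
        y: sum(weights.get(f, 0.0) for f in features(context, y))
        for y in outcomes
    }
    # Subtract the maximum score before exponentiating for numerical stability.
    top = max(scores.values())
    exps = {y: math.exp(s - top) for y, s in scores.items()}
    z = sum(exps.values())  # normalizer Z(x)
    return {y: e / z for y, e in exps.items()}

# Toy weights, invented for illustration (a real model would learn them from data).
weights = {
    ("concept", "TIME", "TIME_TGT"): 1.0,
    ("concept-word", "TIME", "noon", "TIME_TGT"): 2.0,
    ("word", "noon", "PLACE_TGT"): 0.5,
}
probs = maxent_probs(("TIME", "noon"), ["TIME_TGT", "PLACE_TGT"], weights)
```

In this toy setup the joint concept-word feature contributes most of the score for `TIME_TGT`, which is the kind of combined evidence the abstract attributes to the proposed feature.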
Similar resources
Statistical Natural Language Generation for Speech-to-Speech Machine Translation
This paper presents a statistical natural language generation scheme for trainable speech-to-speech machine translation (MT) systems for limited domain applications using a cascaded approach. The natural language generation scheme in the translation systems is based on a maximum entropy (ME) statistical model fully trained from a corpus, allowing flexible translation outputs. In this paper, the...
Statistical natural language generation for speech-to-speech machine translation systems
This paper presents a statistical natural language generation scheme for trainable speech-to-speech machine translation (MT) systems for limited domain applications using a cascaded approach. The natural language generation scheme in the translation systems is based on a maximum entropy (ME) statistical model fully trained from a corpus, allowing flexible translation outputs. In this paper, the...
Study on Unit-Selection and Statistical Parametric Speech Synthesis Techniques
One of the interesting topics in the multimedia domain concerns enabling computers to produce speech. Speech synthesis gives the computer the human ability to produce speech. The data-based approach and the process-based approach are the two main approaches to speech synthesis. Each approach has its own challenges. Unit-selection speech synthesis and statistical parametr...
Use of maximum entropy in natural word generation for statistical concept-based speech-to-speech translation
Our statistical concept-based spoken language translation method consists of three cascaded components: natural language understanding, natural concept generation and natural word generation. In previous approaches, statistical models were used only in the first two components. In this paper, a novel maximum-entropy-based statistical natural word generation algorithm is proposed that takes i...
Improving of Feature Selection in Speech Emotion Recognition Based-on Hybrid Evolutionary Algorithms
One of the important issues in speech emotion recognition is selecting appropriate feature sets in order to improve the detection rate and classification accuracy. In previous studies, researchers tried to select appropriate features for classification by using feature selection and feature-space reduction methods, such as Fisher and PCA. In this research, a hybrid evolutionary algorit...
Publication date: 2004